Improving Sequence to Sequence Neural Machine Translation by Utilizing Syntactic Dependency Information
Authors
Abstract
Sequence-to-sequence Neural Machine Translation has achieved strong performance in recent years. Yet some issues remain that Neural Machine Translation does not solve completely, two of which are the translation of long sentences and “over-translation”. To address these two problems, we propose an approach that utilizes additional grammatical information, such as syntactic dependencies, so that the output can be generated from richer information. In addition, the output of the model is presented not as a simple sequence of tokens but as a linearized tree construction. Experiments on the Europarl-v7 dataset of French-to-English translation demonstrate that our proposed method improves BLEU scores by 1.57 and 2.40 on datasets consisting of sentences with up to 50 and 80 tokens, respectively. Furthermore, the proposed method also mitigates the two existing problems, ineffective translation of long sentences and over-translation, in Neural Machine Translation.
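The idea of emitting a linearized tree rather than a flat token sequence can be illustrated with a small sketch. The bracketing scheme and the `word:deprel` token format below are hypothetical choices for illustration, not the exact linearization used in the paper:

```python
# Sketch: linearizing a dependency tree into a token sequence that a
# sequence-to-sequence decoder could emit. A node is a (word, deprel,
# children) triple; the bracket/label format is an assumed example.

def linearize(node):
    """Depth-first traversal emitting tokens with bracket markers."""
    word, deprel, children = node
    if not children:
        return [f"{word}:{deprel}"]
    tokens = ["(", f"{word}:{deprel}"]
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")")
    return tokens

# "I saw the dog" with a toy dependency analysis.
tree = ("saw", "root",
        [("I", "nsubj", []),
         ("dog", "obj", [("the", "det", [])])])
print(" ".join(linearize(tree)))
# → ( saw:root I:nsubj ( dog:obj the:det ) )
```

Because the brackets and relation labels appear as ordinary output tokens, a standard decoder can produce them, and the target-side syntax becomes explicit in the generated sequence.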
Similar papers
Sequence-to-Dependency Neural Machine Translation
Nowadays a typical Neural Machine Translation (NMT) model generates translations from left to right as a linear sequence, during which latent syntactic structures of the target sentences are not explicitly considered. Inspired by the success of using syntactic knowledge of the target language for improving statistical machine translation, in this paper we propose a novel Sequence-to-Dependency Neura...
Chinese-to-Japanese Patent Machine Translation based on Syntactic Pre-ordering for WAT 2016
This paper presents our Chinese-to-Japanese patent machine translation system for WAT 2016 (Group ID: ntt) that uses syntactic pre-ordering over Chinese dependency structures. Chinese words are reordered by a learning-to-rank model based on pairwise classification to obtain word order close to Japanese. In this year’s system, two different machine translation methods are compared: traditional p...
Incorporating Syntactic Uncertainty in Neural Machine Translation with Forest-to-Sequence Model
Previous work on utilizing parse trees of the source sentence in Attentional Neural Machine Translation was promising. However, current models suffer from a major drawback: they use only the 1-best parse tree, which may lead to translation mistakes due to parsing errors. In this paper we propose a forest-to-sequence Attentional Neural Machine Translation model which uses a forest instead of the 1-best t...
Predicting Target Language CCG Supertags Improves Neural Machine Translation
Neural machine translation (NMT) models are able to partially learn syntactic information from sequential lexical information. Still, some complex syntactic phenomena such as prepositional phrase attachment are poorly modeled. This work aims to answer two questions: 1) Does explicitly modeling target language syntax help NMT? 2) Is tight integration of words and syntax better than multitask tra...
Syntax-aware Neural Machine Translation Using CCG
Neural machine translation (NMT) models are able to partially learn syntactic information from sequential lexical information. Still, some complex syntactic phenomena such as prepositional phrase attachment are poorly modeled. This work aims to answer two questions: 1) Does explicitly modeling target language syntax help NMT? 2) Is tight integration of words and syntax better than multitask tra...